The project analyzes the correlation between Covid-19 data and stock data in order to build models that predict the future direction of stocks from historical data. Through this analysis and the Shiny app we developed, we hope that future investment can become more efficient and safer, helping to lay a solid foundation for global economic recovery. We analyzed existing historical data on the Covid-19 epidemic and on stocks, looking for relationships from which to build a machine learning model that acts as a potential trading indicator and could ultimately supplement an existing trading algorithm in predicting the future performance of stocks. The aim is to provide advice on buying stocks with upside potential and to suggest suitable times to buy or sell. We believe that successfully predicting the future trend of each country’s stocks could help investors reduce, to a great extent, the enormous economic losses caused by the epidemic, and improve their returns. Through our research, we found that the volatility of daily stock prices is closely related to the epidemic data.
The global spread of SARS-CoV-2 (the virus that causes Covid-19) at the beginning of 2020 resulted in millions of deaths and ICU hospitalisations worldwide. The economy was severely impacted by this pandemic and, as a result, so were the public equity markets of leading countries (Australia, Japan, India, USA and China). Most equity markets reached a peak in February 2020 and dipped around March 23 as a majority of the world’s largest economies were forced into lockdown (Seven & Yilmaz 2021). This report evaluates the impact of Covid-19 on global equity markets and provides a guide for retail traders and investors. Using data collected from owid-covid-data, from Investing.com for indices, and from Yahoo Finance for US companies, we ran basic linear regressions and computed correlation matrices, and we present the analysis in a Shiny app to better communicate the effects of Covid-19 on the relevant public equity markets. We used several modeling algorithms (KNN, RDA, RPART) in combination with existing trading algorithms to assess how accurately trends can be predicted from the Covid-related variables of interest and equity market prices, and to highlight any relevant relationships between the two data sets.
A hypothesis formulated by Diermeier, Ibbotson, & Siegel (1984) and Ibbotson & Chen (2003) states that a stock market’s success relies on the success of businesses. With lockdowns and strict social distancing laws in place globally, business and trade were disrupted and in many cases brought to a halt. This series of events paved the way for negative returns, greater volatility, and higher trading volume in global equity markets (Harjoto et al. 2021). As we are not aware of current studies that report on the relationship between stock markets and daily cases and mortality rates, this report aims to make preliminary findings: to identify variables that are responsible for changes in equity markets and to relate them to the current Covid-19 crisis, in order to show how this virus has caused instability in stock markets.
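library(dplyr)   # %>% and select()
library(tidyr)   # drop_na()
# Work on a copy of the joined Covid/price data and name the 7th column 'Change'
df_q2 = covid_joined
colnames(df_q2)[7] = 'Change'
write.csv(df_q2,"df_q2.csv")
# Sort by date and keep only rows where all four variables of interest are present
df_q2 = as.data.frame(df_q2[order(df_q2$Date),]) %>% drop_na(new_vaccinations, new_tests, new_cases, Price)
df_q2_cor = df_q2 %>% dplyr::select(new_vaccinations, new_tests, new_cases, Price)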
We are interested in four variables: Price (the price of the equity market), new_vaccinations (the number of people vaccinated each day), new_tests (the number of people tested each day) and new_cases (the number of new cases each day). We selected these variables because they were available for all five countries of interest, which made them suitable for analysis. A correlation matrix of these four variables is computed below to explore possible correlations between them.
M=cor(df_q2_cor)
#qtlcharts::iplotCorr(df_q2_cor)
library(corrplot)
## corrplot 0.92 loaded
corrplot(M, method="circle")
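The correlation matrix shows the strength of each association but not whether it is distinguishable from zero. The following is a minimal sketch of one way to add that check, pairing `cor()` with `cor.test()`. It runs on synthetic data: the data frame `toy`, its values, and the names `M_toy` and `p_toy` are invented for illustration and are not part of the report's data.

```r
# Synthetic stand-in for df_q2_cor; all values are invented for illustration.
set.seed(1)
n <- 200
new_cases <- rpois(n, lambda = 500)
toy <- data.frame(
  new_cases = new_cases,
  new_tests = 10 * new_cases + rnorm(n, sd = 100),  # built to track new_cases
  new_vaccinations = rpois(n, lambda = 2000),
  Price = rnorm(n, mean = 100, sd = 5)
)

# Pearson correlation matrix, the same kind of object passed to corrplot().
M_toy <- cor(toy, method = "pearson")

# p-value of the Pearson test for every pair of columns (diagonal left at 0).
vars <- colnames(toy)
p_toy <- matrix(0, length(vars), length(vars), dimnames = list(vars, vars))
for (a in vars) {
  for (b in vars) {
    if (a != b) p_toy[a, b] <- cor.test(toy[[a]], toy[[b]])$p.value
  }
}

round(M_toy, 2)
signif(p_toy, 2)
```

A matrix such as `p_toy` can then be supplied to `corrplot()` through its `p.mat` argument so that non-significant cells are marked on the plot.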
It’s clear that there is a correlation between Price and new_vaccinations, and between new_cases and new_tests, so we focus on those two pairs of variables. We split the daily averages into consecutive windows of 30 observations (intended as roughly one month each) and calculated the correlation of Price with new_vaccinations, and of new_cases with new_tests, within each window. The resulting plot below shows how the correlations between those variables changed over the epidemic period.
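# Average each variable across countries for every date
df_q2_date = df_q2 %>% group_by(Date) %>% summarize(avg_price = mean(Price), avg_new_cases = mean(new_cases), avg_new_tests = mean(new_tests), avg_new_vac = mean(new_vaccinations))
nrow_df_q2 = 360   # number of rows covered by the windows (12 windows of 30 rows)
i = 1              # first row of the current window
cor1 <- list()     # cor.test results for price ~ new vaccinations
time_ls <- list()  # start date of each window
j = 1              # window counter
while (i <= nrow_df_q2) {
  # Rows i to i+29 form one 30-observation window
  subset <- df_q2_date[i:(i+29),]
  cor1[[j]] <- cor.test(subset$avg_price,subset$avg_new_vac, method = "pearson")
  time_ls[[j]] = subset$Date[1]
  print(time_ls[[j]])
  j = j + 1
  i = i + 30
}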
## [1] "2020-12-14"
## [1] "2021-01-27"
## [1] "2021-03-11"
## [1] "2021-04-22"
## [1] "2021-06-03"
## [1] "2021-07-13"
## [1] "2021-08-20"
## [1] "2021-10-01"
## [1] "2021-11-12"
## [1] "2021-12-24"
## [1] "2022-02-04"
## [1] "2022-03-18"
out <- lapply(cor1, function(x) c(x$estimate, x$conf.int, x$p.value))
D1 <- data.frame(cbind(index = seq(length(out)), do.call(rbind, out)))
names(D1)[2:ncol(D1)] <- c('estimate', paste0('conf.int', 1:2), 'p.value')
#D1
i = 1
cor2 <- list()
j = 1
while (i <= nrow_df_q2) {
subset <- df_q2_date[i:(i+29),]
cor2[[j]] <- cor.test(subset$avg_new_tests,subset$avg_new_cases, method = "pearson")
j = j + 1
i = i + 30
}
out <- lapply(cor2, function(x) c(x$estimate, x$conf.int, x$p.value))
D2 <- data.frame(cbind(index = seq(length(out)), do.call(rbind, out)))
names(D2)[2:ncol(D2)] <- c('estimate', paste0('conf.int', 1:2), 'p.value')
df_q2_date = as.data.frame(time_ls)
df_q2_date = as.data.frame(t(df_q2_date))
#df_q2_date
colnames(D1)[1] ="start_date"
colnames(D1)[2] ="avg_price_avg_new_vacc"
colnames(D1)[3] ="avg_new_tests_avg_new_cases"
#,"avg_price_avg_new_vacc","avg_new_tests_avg_new_cases"
df_q2_cor_time = D1 %>% mutate(start_date = df_q2_date$V1, avg_new_tests_avg_new_cases = D2$estimate)
#cor_df_time
df_q2_cor_time$start_date = as.Date(df_q2_cor_time$start_date)
df_q2_cor_time = df_q2_cor_time[ , -which(names(df_q2_cor_time) %in% c("conf.int2","p.value"))]
#df_q2_cor_time
library(ggplot2)  # ggplot(), geom_line()
library(plotly)   # ggplotly()
p1 = ggplot(data = df_q2_cor_time, aes(x = start_date)) + geom_line(aes(y=avg_new_tests_avg_new_cases, colour = "avg new_tests ~ avg new_cases"), size = 0.8) +
geom_line(aes(y=avg_price_avg_new_vacc, colour = 'avg price ~ avg new_vacc'), size = 0.8) +
scale_colour_manual("",
values = c("avg new_tests ~ avg new_cases"="#FF00CC", "avg price ~ avg new_vacc"="#3333FF")) +
ggtitle("Price~New_vaccinations, New_tests~New_cases") + xlab("Date") +
ylab("Correlation")+
#scale_fill_manual(values = c("light green", "yellow"))+
theme_bw()
ggplotly(p1)
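The windowed correlation above can also be written as a small reusable function. The following is a minimal sketch under stated assumptions, not the report's own code: `window_cor` is a hypothetical helper, and the daily series are synthetic and invented for illustration. It uses the same consecutive, non-overlapping windows and the same `cor.test()` call as the loops above.

```r
# Correlation of x and y within consecutive, non-overlapping windows.
# `window_cor` is a hypothetical helper, not a function from the report.
window_cor <- function(x, y, dates, width = 30) {
  stopifnot(length(x) == length(y), length(x) == length(dates))
  n_windows <- length(x) %/% width            # an incomplete final window is dropped
  starts <- (seq_len(n_windows) - 1) * width + 1
  rows <- lapply(starts, function(s) {
    idx <- s:(s + width - 1)
    ct <- cor.test(x[idx], y[idx], method = "pearson")
    data.frame(
      start_date = dates[s],
      estimate = unname(ct$estimate),
      conf_low = ct$conf.int[1],
      conf_high = ct$conf.int[2],
      p_value = ct$p.value
    )
  })
  do.call(rbind, rows)
}

# Synthetic daily series, invented for illustration only.
set.seed(42)
dates <- seq(as.Date("2021-01-01"), by = "day", length.out = 100)
cases <- cumsum(rnorm(100)) + 50
tests <- 3 * cases + rnorm(100, sd = 2)

res <- window_cor(tests, cases, dates, width = 30)
res
```

In this sketch the number of windows is derived from the length of the data instead of a fixed row count, so no window can run past the last row, and the confidence interval and p-value stay alongside each estimate.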
The pink line (the correlation between average new tests and average new cases in each window) shows a decreasing trend. This is possibly the result of reduced testing as the pandemic progressed while the number of cases remained steady or increased, which would cause the correlation between the two variables to decline. The blue line, the correlation between average price and average new vaccinations, became relatively steady towards August 2021. We inferred that this might correspond to rising vaccination rates allowing a return to normal living, which in turn led the equity markets back to a steadier pattern. To analyze how the equity market behaved as a whole during the epidemic period, we built a machine learning model. The process of building the models, and their further analysis and evaluation, is described below.
Our group combines Covid-19 data and stock quote data to analyze the effects of Covid-19 on the relevant public equity markets. By collecting information from leading countries (Australia, Japan, India, USA and China) and analyzing it with linear regressions and correlation matrices, we found that the prices of equity markets are closely related to this global pandemic. We obtained daily prices for 15 stocks from the USA stock market, which is widely regarded as a representative market, from the Yahoo Finance database, and then used machine learning models (KNN, RDA, RPART) to make predictions. We also developed a Shiny app as a guide for retail traders and investors. As a result, users can gain a clear understanding of the impact of Covid-19 on the stock market, on which they can base wiser investment decisions.
Harjoto, M., Rossi, F., Lee, R., & Sergi, B. (2021). How do equity markets react to COVID-19? Evidence from emerging and developed countries. Journal Of Economics And Business, 115, 105966. https://doi.org/10.1016/j.jeconbus.2020.105966
Seven, Ü., & Yılmaz, F. (2021). World equity markets and COVID-19: Immediate response and recovery prospects. Research In International Business And Finance, 56, 101349. https://doi.org/10.1016/j.ribaf.2020.101349